AITopics | transfer regret

transfer regret

Meta Learning in Bandits within Shared Affine Subspaces

Bilaj, Steven, Dhouib, Sofien, Maghsudi, Setareh

arXiv.org Machine Learning Mar-31-2024

We study the problem of meta-learning several contextual stochastic bandits tasks by leveraging their concentration around a low-dimensional affine subspace, which we learn via online principal component analysis to reduce the expected regret over the encountered bandits. We propose and theoretically analyze two strategies that solve the problem: One based on the principle of optimism in the face of uncertainty and the other via Thompson sampling. Our framework is generic and includes previously proposed approaches as …

In the applications mentioned above, the tasks often relate to each other despite being different. For instance, subgroups of patients have comparable features. As another example, holidays or discount periods promote similar interests in the products of an e-commerce website. That observation motivates us to look beyond a single task to uncover a relation between different ones to accelerate learning on newly encountered tasks. That problem, referred to as meta-learning or learning-to-learn (LTL), has mainly appeared in the offline learning literature so far (Hutter et al., 2019). Nevertheless, an emergent body of literature combines LTL and MAB to accelerate learning and reduce the average regret per task (Cella et al., 2020; Cella and Pontil, 2021; Bilaj et al., 2023).
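To make the idea of task parameters concentrating around a low-dimensional affine subspace more concrete, here is a minimal Python sketch. It is not the paper's algorithm: it uses a batch SVD as a stand-in for online principal component analysis, and the function names `fit_affine_subspace` and `project`, the dimensions, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np

def fit_affine_subspace(thetas, k):
    """Fit a k-dimensional affine subspace to the rows of `thetas`.

    Returns the mean (a point on the subspace) and a (d, k) matrix whose
    orthonormal columns span the top-k principal directions.
    Batch SVD is used here; the abstract refers to an online variant.
    """
    mean = thetas.mean(axis=0)
    _, _, vt = np.linalg.svd(thetas - mean, full_matrices=False)
    return mean, vt[:k].T

def project(theta, mean, basis):
    """Orthogonally project `theta` onto the affine subspace mean + span(basis)."""
    return mean + basis @ (basis.T @ (theta - mean))

# Hypothetical setup: 30 past tasks in dimension 8, lying near a 2-dim affine subspace.
rng = np.random.default_rng(0)
d, k, n_tasks = 8, 2, 30
offset = rng.normal(size=d)
directions = rng.normal(size=(d, k))
past_thetas = offset + rng.normal(size=(n_tasks, k)) @ directions.T
past_thetas += 0.01 * rng.normal(size=(n_tasks, d))  # small off-subspace noise

mean, basis = fit_affine_subspace(past_thetas, k)

# A new task drawn near the same subspace should have a small projection residual.
new_theta = offset + directions @ rng.normal(size=k) + 0.01 * rng.normal(size=d)
residual = np.linalg.norm(new_theta - project(new_theta, mean, basis))
print("projection residual:", residual)
```

Under these assumptions, a learner facing a new task only needs to estimate the k coordinates within the subspace rather than all d parameters, which is the intuition behind the reduced regret.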

algorithm, inequality, learning, (16 more...)

arXiv.org Machine Learning

2404.00688

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Industry:

Education (0.66)
Information Technology > Services > e-Commerce Services (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Meta Learning MDPs with Linear Transition Models

Müller, Robert, Pacchiano, Aldo

arXiv.org Artificial Intelligence Jan-21-2022

We study meta-learning in Markov Decision Processes (MDPs) with linear transition models in the undiscounted episodic setting. Under a task sharedness metric based on model proximity, we study task families characterized by a distribution over models specified by a bias term and a variance component. We then propose BUC-MatrixRL, a version of the UC-Matrix RL algorithm, and show it can meaningfully leverage a set of sampled training tasks to quickly solve a test task sampled from the same task distribution by learning an estimator of the bias parameter of the task distribution. The analysis leverages and extends results in the learning-to-learn linear regression and linear bandit settings to the more general case of MDPs with linear transition models. We prove that, compared to learning the tasks in isolation, BUC-MatrixRL provides significant improvements in the transfer regret for high-bias, low-variance task distributions.

artificial intelligence, machine learning, transfer regret, (15 more...)

arXiv.org Artificial Intelligence

2201.08732

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Meta-learning with Stochastic Linear Bandits

Cella, Leonardo, Lazaric, Alessandro, Pontil, Massimiliano

arXiv.org Machine Learning May-18-2020

We investigate meta-learning procedures in the setting of stochastic linear bandit tasks. The goal is to select a learning algorithm that works well on average over a class of bandit tasks sampled from a task distribution. Inspired by recent work on learning-to-learn linear regression, we consider a class of bandit algorithms that implement a regularized version of the well-known OFUL algorithm, where the regularization is a squared Euclidean distance to a bias vector. We first study the benefit of the biased OFUL algorithm in terms of regret minimization. We then propose two strategies to estimate the bias within the learning-to-learn setting. We show, both theoretically and experimentally, that when the number of tasks grows and the variance of the task distribution is small, our strategies have a significant advantage over learning the tasks in isolation.
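The regularization toward a bias vector mentioned in this abstract can be illustrated with a small Python sketch of the underlying least-squares step. This is only the estimator, not the full OFUL algorithm (no confidence sets or arm selection), and the function name `biased_ridge`, the regularization weight `lam`, and the toy data are assumptions made for illustration. The estimator minimizes ||X theta - y||^2 + lam * ||theta - h||^2, which has the closed form theta = h + (X^T X + lam I)^{-1} X^T (y - X h).

```python
import numpy as np

def biased_ridge(X, y, h, lam):
    """Ridge regression regularized toward a bias vector h instead of zero.

    Solves argmin_theta ||X @ theta - y||^2 + lam * ||theta - h||^2
    via the closed form h + (X^T X + lam I)^{-1} X^T (y - X h).
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return h + np.linalg.solve(A, X.T @ (y - X @ h))

# Hypothetical task: the true parameter lies close to a shared bias vector of ones.
rng = np.random.default_rng(0)
d, n = 5, 3  # fewer observations than dimensions
theta_star = np.ones(d) + 0.1 * rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ theta_star + 0.1 * rng.normal(size=n)

theta_biased = biased_ridge(X, y, h=np.ones(d), lam=1.0)   # bias near the task
theta_plain = biased_ridge(X, y, h=np.zeros(d), lam=1.0)   # standard ridge

print("error with bias h = 1:", np.linalg.norm(theta_biased - theta_star))
print("error with bias h = 0:", np.linalg.norm(theta_plain - theta_star))
```

Setting `h` to zero recovers standard ridge regression, so the bias vector is the only extra quantity a meta-learner would need to estimate across tasks in this sketch.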

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

2005.08531

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)
